Ref # | Hits | Search Query | DBs | Default Operator | Plurals | Time Stamp
L1 | 28630 | user$1 near (preference$1 or profile$1) | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:38
L2 | 11163 | (creat$3 or generat$3 or build$3) same ("portal web site" or "portal web sites" or "web site" or "web sites") | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:41
L3 | 66570 | stor$4 near database | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:40
L4 | 11850 | (construct$4 or creat$3 or generat$3 or build$3) same ("portal web site" or "portal web sites" or "web site" or "web sites") | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:43
L5 | 4069 | 3 and 4 | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:42
L6 | 1335 | 1 and 5 | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:42
L7 | 23498 | "707"/$.ccls. | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:42
L8 | 19784 | "715"/$.ccls. | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:42
L9 | 41027 | 7 or 8 | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:42
L10 | 365 | 6 and 9 | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:43
L11 | 11850 | (construct$4 or creat$3 or generat$3 or build$3) same ("portal web site" or "portal web sites" or "web site" or "web sites" or "web sites" or "web site") | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:45
L12 | 2637 | 1 and 11 | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:45
L13 | 1335 | 5 and 12 | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:46
L14 | 365 | 9 and 13 | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:46
L15 | 1 | 14 and (seed near3 data) | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:48
L16 | 1 | 14 and (initial near3 appearance$1) | US-PGPUB; USPAT; EPO; JPO; DERWENT; IBM_TDB | OR | ON | 2004/11/29 10:49
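
The search history narrows its results by combining earlier numbered result sets with "and" / "or" (for example, 5 is "3 and 4", 9 is "7 or 8", and 10 is "6 and 9"). The following is a minimal Python sketch of that set arithmetic only. The document IDs and the helper name combine_result_sets are hypothetical; the actual result sets and the search tool's internals are not part of this record.

```python
# Hypothetical document IDs standing in for the real (unrecorded) result sets.
base_sets = {
    1: {"D1", "D2", "D3", "D4"},  # stands in for the user preference/profile query
    3: {"D2", "D3", "D4", "D5"},  # stands in for the "stor$4 near database" query
    4: {"D1", "D2", "D3", "D6"},  # stands in for the web-site creation query
    7: {"D2", "D9"},              # stands in for the class 707 query
    8: {"D3", "D7"},              # stands in for the class 715 query
}


def combine_result_sets(sets):
    """Rebuild the derived sets 5, 6, 9 and 10 from the base sets."""
    s = dict(sets)
    s[5] = s[3] & s[4]   # "3 and 4": documents present in both sets
    s[6] = s[1] & s[5]   # "1 and 5"
    s[9] = s[7] | s[8]   # "7 or 8": documents present in either set
    s[10] = s[6] & s[9]  # "6 and 9"
    return s


result = combine_result_sets(base_sets)
print(sorted(result[10]))
```

An "and" step can only keep or shrink a set, which matches the falling hit counts in the table (4069, then 1335, then 365).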
(19) United States

(12) Patent Application Publication (10) Pub. No.: US 2002/0103822 A1

Miller (43) Pub. Date: Aug. 1, 2002



(54) METHOD AND SYSTEM FOR
CUSTOMIZING AN OBJECT FOR
DOWNLOADING VIA THE INTERNET

(76) Inventor: Isaac Miller, Givatayim (IL) 

Correspondence Address: 
BROWDY AND NEIMARK, P.L.L.C. 
624 Ninth Street, N.W. 
Washington, DC 20001 (US) 

(21) Appl. No.: 09/773,367 

(22) Filed: Feb. 1, 2001 

(30) Foreign Application Priority Data 

Feb. 1, 2001 (IL) 134319

(52) U.S. Cl. 707/501.1



(57) ABSTRACT 

A computer-implemented method and system for downloading a
customized base object to a client machine, wherein a
Content Provider web server obtains an enhanced object
created from the base object, the enhanced object having
associated therewith an identification for allowing an agent
web server to uniquely identify it, and having embedded
therein program code for effecting communication with the
agent web server. The Content Provider web server downloads
to a client machine an XHTML page containing a
reference to the enhanced object so as to allow activation of
the program code by the client machine and thereby to call
the agent web server, so as to allow the agent web server to
download auxiliary data to the client machine in predetermined
association with the enhanced object within a boundary
thereof.



[Figure: flow chart of the exchange between the Content Provider web server and the Agent Server; legible labels only]

CONTENT PROVIDER WEB SERVER
... STORE EO
CONTENT PROVIDER ... OBJECT AND UPLOADS
CONTENT PROVIDER ... LOCATION & DESCRIPTION FOR UPLOADED OBJECT
CONTENT PROVIDER REQUESTS AN ENHANCED OBJECT TO BE CREATED FROM THE UPLOADED BASE OBJECT
CONTENT PROVIDER REQUEST TO RECEIVE THE CREATED EO VIA EITHER BROWSER DOWNLOAD, FTP, E-MAIL ETC

MESSAGES
UPLOAD TO SERVER
ACKNOWLEDGEMENT
REQUEST TO GENERATE EO
ACK. THAT EO HAS BEEN CREATED AND ITS LOCATION
REQUEST TO RECEIVE EO
EO IS SENT BACK (OR DOWNLOADED)
CCC CODE

AGENT SERVER
AGENT SERVER STORES BASE OBJECT AT SPECIFIED LOCATION UNDER SPECIFIED NAME WITH SUPPLIED DESCRIPTION THEN SENDS BACK AN ACKNOWLEDGEMENT.
AGENT SERVER GENERATE EO EMBEDDING IN IT THE BASE OBJECT, A BOOTSTRAP EEO AND A UNIQUE ID. ALSO EMBEDDED IS THE AGENT SERVER IP ADDRESS
SEND EO AND CCC (COMPATIBILITY CUSTOMIZATION & CONTEXT) XHTML CODE TEMPLATE
(57) ABSTRACT



This is a system and method for processing and selectively
storing content of an Internet web site. A key aspect of each
variation of the invention is the distillation of information
associated with an Internet location to which the user has
browsed using various algorithms operating in the background
to produce a linked group of distilled pieces of
information (a "datagram") which may be used in various
ways for or by the user.

1 Constructing, organizing, and visualizing collections of topically related Web resources
Loren Terveen, Will Hill, Brian Amento

March 1999 ACM Transactions on Computer-Human Interaction (TOCHI), volume 6 issue 1

Full text available: pdf(303.62 KB) Additional Information: full citation, abstract, references, citings, index terms

For many purposes, the Web page is too small a unit of interaction and analysis. Web sites 
are structured multimedia documents consisting of many pages, and users often are 
interested in obtaining and evaluating entire collections of topically related sites. Once such 
a collection is obtained, users face the challenge of exploring, comprehending and 
organizing the items. We report four innovations that address these user needs: (1) we 
replaced the Web page with the Web site 

Keywords: cocitation analysis, collaborative filtering, computer supported cooperative 
work, information visualization, social filtering, social network analysis 



2 Personalizing web sites for mobile users
Corin R. Anderson, Pedro Domingos, Daniel S. Weld

April 2001 Proceedings of the tenth international conference on World Wide Web

Full text available: pdf(385.99 KB) Additional Information: full citation , references , citings , index terms



3 Effective Web data extraction with standard XML technologies
Jussi Myllymaki

April 2001 Proceedings of the tenth international conference on World Wide Web

Full text available: pdf(198.81 KB) Additional Information: full citation , references , citings , index terms

Keywords: crawling, data extraction, deep Web, semistructured data, wrappers 

4 Technical papers: testing I: Improving web application testing with user session data
Sebastian Elbaum, Srikanth Karre, Gregg Rothermel

May 2003 Proceedings of the 25th International Conference on Software Engineering

Full text available: pdf, Publisher Site Additional Information: full citation , abstract , references

Web applications have become critical components of the global information infrastructure,
and it is important that they be validated to ensure their reliability. Therefore, many
techniques and tools for validating web applications have been created. Only a few of these
techniques, however, have addressed problems of testing the functionality of web
applications, and those that do have not fully considered the unique attributes of web
applications. In this paper we explore the notion that user s ...

http://portal.acm.org/results.cfm?coll=ACM&dl=ACM&CFro 11/29/04

5 Experiments in social data mining: The TopicShop system
Brian Amento, Loren Terveen, Will Hill, Deborah Hix, Robert Schulman

March 2003 ACM Transactions on Computer-Human Interaction (TOCHI), volume 10 issue 1

Full text available: pdf(377.92 KB) Additional Information: full citation , abstract , references , citings, index terms

Social data mining systems enable people to share opinions and benefit from each other's 
experience. They do this by mining and redistributing information from computational 
records of social activity such as Usenet messages, system usage history, citations, or 
hyperlinks. Some general questions for evaluating such systems are: (1) is the extracted 
information valuable? and (2) do interfaces based on the information improve user task 
performance? We report here on TopicShop, a syst ... 

Keywords: Cocitation analysis, collaborative filtering, computer-supported cooperative 
work, information visualization, social filtering, social network analysis 



6 Information retrieval 2: Dynamic maintenance of web indexes using landmarks
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey Scott Vitter, Ramesh Agarwal

May 2003 Proceedings of the twelfth international conference on World Wide Web

Full text available: pdf(233.78 KB) Additional Information: full citation, abstract, references, citings, index terms

Recent work on incremental crawling has enabled the indexed document collection of a 
search engine to be more synchronized with the changing World Wide Web. However, this 
synchronized collection is not immediately searchable, because the keyword index is rebuilt 
from scratch less frequently than the collection can be refreshed. An inverted index is 
usually used to index documents crawled from the web. Complete index rebuild at high 
frequency is expensive. Previous work on incremental inverted in ... 

Keywords: inverted files, update processing 



7 Emergent web patterns: The connectivity sonar: detecting site functionality by 
structural patterns 

Einat Amitay, David Carmel, Adam Darlow, Ronny Lempel, Aya Soffer 
August 2003 Proceedings of the fourteenth ACM conference on Hypertext and 
hypermedia 

Full text available: pdf(153.40 KB) Additional Information: full citation , abstract , references , index terms

Web sites today serve many different functions, such as corporate sites, search engines,
e-stores, and so forth. As sites are created for different purposes, their structure and
connectivity characteristics vary. However, this research argues that sites of similar role 
exhibit similar structural patterns, as the functionality of a site naturally induces a typical 
hyperlinked structure and typical connectivity patterns to and from the rest of the Web. 
Thus, the functionality of Web sites is refle ... 

Keywords: link analysis, web IR, web graphs 



8 Industry track: InfoAnalyzer: a computer-aided tool for building enterprise taxonomies
Li Zhang, ShiXia Liu, Yue Pan, LiPing Yang 

November 2004 Proceedings of the Thirteenth ACM conference on Information and 
knowledge management 

Full text available: pdf(146.36 KB) Additional Information: full citation , abstract , references , index terms

In this paper we study the problem of collecting training samples for building enterprise 
taxonomies. We develop a computer-aided tool named InfoAnalyzer, which can effectively 
assist the enterprise to prepare large set of samples used for machine learning in text 
categorization. In our system, the enterprise category tree is initially defined by some 
keywords, then the Google search engine is used to construct a small set of labeled 
documents, and topic tracking algorithm based on document I ... 

Keywords: document length normalization, relevance feedback, shannon entropy measure, 
support vector machine, topic tracking 



9 TopicShop: enhanced support for evaluating and organizing collections of Web sites
Brian Amento, Loren Terveen, Will Hill, Deborah Hix 

November 2000 Proceedings of the 13th annual ACM symposium on User interface 
software and technology 

Full text available: pdf(526.36 KB) Additional Information: full citation , references , citings , index terms 



10 Efficient identification of Web communities 
Gary William Flake, Steve Lawrence, C. Lee Giles 

August 2000 Proceedings of the sixth ACM SIGKDD international conference on 
Knowledge discovery and data mining 

Full text available: pdf(273.37 KB) Additional Information: full citation , references , citings , index terms



11 Image Retrieval from the World Wide Web: Issues, Techniques, and Systems 
M. L. Kherfi, D. Ziou, A. Bernardi 

March 2004 ACM Computing Surveys (CSUR), volume 36 issue 1

Full text available: pdf(294.13 KB) Additional Information: full citation , abstract , references , index terms

With the explosive growth of the World Wide Web, the public is gaining access to massive 
amounts of information. However, locating needed and relevant information remains a 
difficult task, whether the information is textual or visual. Text search engines have existed 
for some years now and have achieved a certain degree of success. However, despite the 
large number of images available on the Web, image search engines are still rare. In this 
article, we show that in order to allow people to profi ... 

Keywords: Image-retrieval, World Wide Web, crawling, feature extraction and selection, 
indexing, relevance feedback, search, similarity 



12 Repository architectures: BDBComp: building a digital library for the Brazilian computer
science community

Alberto H. F. Laender, Marcos André Gonçalves, Pablo A. Roberto

June 2004 Proceedings of the 2004 joint ACM/IEEE conference on Digital libraries

Full text available: pdf(157.14 KB) Additional Information: full citation , abstract , references , index terms

This paper reports initial efforts towards building BDBComp, a digital library for the Brazilian
computer science community. BDBComp is based on a number of standards (e.g., OAI,
Dublin Core, SQL) as well as on new technologies (e.g., Web data extraction tools), which 
allowed fast and easy prototyping. The paper focuses on architectural issues and specific 
challenges faced during the construction of this digital library as well as on proposed 
solutions. 

Keywords: DL modeling, OAI, computing digital libraries, national DLs



13 Information retrieval session 5: general retrieval issues II: Multi-resolution
disambiguation of term occurrences

Einat Amitay, Rani Nelken, Wayne Niblack, Ron Sivan, Aya Soffer 

November 2003 Proceedings of the twelfth international conference on Information and 
knowledge management 

Full text available: pdf(371.34 KB) Additional Information: full citation , abstract , references , index terms

We describe a system for extracting mentions of terms such as company and product 
names, in a large and noisy corpus of documents, such as the World Wide Web. Since 
natural language terms are highly ambiguous, a significant challenge in this task is 
disambiguating which occurrences of each term are truly related to the right meaning, and 
which are not. We describe our approach for disambiguation, and show that it achieves very 
high accuracy with only limited training. This serves as a necessary ... 

Keywords: disambiguation, information retrieval, natural language processing, text mining 



14 An empirical evaluation of user interfaces for topic management of Web sites 
Brian Amento, Will Hill, Loren Terveen, Deborah Hix, Peter Ju 

May 1999 Proceedings of the SIGCHI conference on Human factors in computing 
systems: the CHI is the limit 

Full text available: pdf(2.07 MB) Additional Information: full citation , abstract , references , citings, index terms

Topic management is the task of gathering, evaluating, organizing, and sharing a set of web 
sites for a specific topic. Current web tools do not provide adequate support for this task. 
We created the TopicShop system to address this need. TopicShop includes (1) a 
webcrawler that discovers relevant web sites and builds site profiles, and (2) user interfaces 
for exploring and organizing sites. We conducted an empirical study comparing user 
performance with TopicShop vs. Ya ... 

Keywords: computer supported cooperative work, human-computer interaction, 
information access, information retrieval, information visualization, social filtering 



15 Multimedia: OCTOPUS: aggressive search of multi-modality data using multifaceted 
knowledge base 
Jun Yang, Qing Li, Yueting Zhuang 

May 2002 Proceedings of the eleventh international conference on World Wide Web 

Full text available: pdf(321.15 KB) Additional Information: full citation , abstract , references , index terms

An important trend in Web information processing is the support of multimedia retrieval. 
However, the most prevailing paradigm for multimedia retrieval, content-based retrieval 
(CBR), is a rather conservative one whose performance depends on a set of specifically 
defined low-level features and a carefully chosen sample object. In this paper, an aggressive 
search mechanism called Octopus is proposed which addresses the retrieval of
multi-modality data using multifaceted knowledge. In parti ...

Keywords: layered graph model, link analysis, multi-modality data, multifaceted knowledge 
base, multimedia retrieval, relevance feedback 



16 Self-similarity in the web 

Stephen Dill, Ravi Kumar, Kevin S. McCurley, Sridhar Rajagopalan, D. Sivakumar, Andrew
Tomkins

August 2002 ACM Transactions on Internet Technology (TOIT), volume 2 issue 3

Full text available: pdf(483.69 KB) Additional Information: full citation , abstract , references , citings , index terms

Algorithmic tools for searching and mining the Web are becoming increasingly sophisticated
and vital. In this context, algorithms that use and exploit structural information about the
Web perform better than generic methods in both efficiency and reliability. We present an
extensive characterization of the graph structure of the Web, with a view to enabling
high-performance applications that make use of this structure. In particular, we show that the
Web emerges as the outcome of a number of esse ...

Keywords: Fractal, Web-based services, World-Wide-Web, graph structure, online 
information services, self-similarity 



17 Papers: On the move: From desktop to phonetop: a UI for web interaction on very
small devices 

Jonathan Trevor, David M. Hilbert, Bill N. Schilit, Tzu Khiau Koh 

November 2001 Proceedings of the 14th annual ACM symposium on User interface 
software and technology 

Full text available: pdf(1.34 MB) Additional Information: full citation , abstract , references , citings , index terms

While it is generally accepted that new Internet terminals should leverage the installed base 
of Web content and services, the differences between desktop computers and very small 
devices makes this challenging. Indeed, the browser interaction model has evolved on 
desktop computers having a unique combination of user interface (large display, keyboard, 
pointing device), hardware, and networking capabilities. In contrast, Internet enabled cell 
phones, typically with 3-10 lines of text, sacrifice ... 

Keywords: PDA, Web browsing, transcoding, transducing, web phone, wireless web 



18 Industrial/government track: Golden Path Analyzer: using divide-and-conquer to cluster
Web clickstreams
Kamal Ali, Steven P. Ketchpel 

August 2003 Proceedings of the ninth ACM SIGKDD international conference on 
Knowledge discovery and data mining 

Full text available: pdf(433.92 KB) Additional Information: full citation , abstract , references , index terms

This paper describes a novel algorithm and deployed system Golden Path Analyzer (GPA)
that analyzes clickstreams of people trying to complete the same task on a website. It finds
the shortest, successful paths taken by users - 'golden paths' - and uses these as seeds for
clickstream clusters. Other users are assigned to a cluster if their clickstream is a 
supersequence of the golden path. The advantages of this approach are that the resulting 
clusters are easily comprehended, they are few in num ... 

Keywords: Web-mining, clustering, divide-and-conquer 
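
The abstract above says a user joins a cluster when the user's clickstream is a supersequence of that cluster's golden path. The Python sketch below illustrates only that membership test. The function names, page names, and the choice to place a user in every matching cluster are assumptions made for illustration; the abstract does not describe the paper's actual implementation.

```python
def is_supersequence(clickstream, golden_path):
    """Return True if golden_path occurs within clickstream in order
    (not necessarily contiguously)."""
    remaining = iter(clickstream)
    # "page in remaining" advances the iterator until the page is found,
    # so later pages must appear after earlier ones.
    return all(page in remaining for page in golden_path)


def assign_to_clusters(clickstreams, golden_paths):
    """Map each golden path name to the users whose clickstreams contain it.

    Assumption: a user is listed under every golden path they match.
    """
    clusters = {name: [] for name in golden_paths}
    for user, stream in clickstreams.items():
        for name, path in golden_paths.items():
            if is_supersequence(stream, path):
                clusters[name].append(user)
    return clusters


# Hypothetical data, not taken from the paper.
golden_paths = {"checkout": ["home", "cart", "pay"]}
clickstreams = {
    "u1": ["home", "search", "cart", "pay"],  # detour, but order preserved
    "u2": ["home", "pay", "cart"],            # right pages, wrong order
}
print(assign_to_clusters(clickstreams, golden_paths))
```

Here u1 matches because the golden path's pages appear in order despite the detour, while u2 does not because "cart" comes after "pay".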



19 Intelligent crawling on the World Wide Web with arbitrary predicates 
Charu C. Aggarwal, Fatima Al-Garawi, Philip S. Yu

April 2001 Proceedings of the tenth international conference on World Wide Web

Full text available: pdf(272.60 KB) Additional Information: full citation , references , citings , index terms




Keywords: World Wide Web, crawling, querying 



20 Web community mining and web log mining: commodity cluster based execution 
Masaru Kitsuregawa, Masashi Toyoda, Iko Pramudiono 

January 2002 Australian Computer Science Communications , Proceedings of the
thirteenth Australasian conference on Database technologies - Volume 5,
Volume 24 Issue 2

Full text available: pdf(801.13 KB) Additional Information: full citation , abstract , references , index terms

The emergence of WWW has drawn new frontiers for database research. Web mining has 
become a hot topic since WWW rapid expansion rate and chaotic nature have exposed some 
technical challenges as well as interesting discoveries. In general web mining can be 
classified into web structure mining and web usage mining. Here we introduce two 
applications of web mining, first from mining the web structure we identify web 
communities, and the second we mine web usage of mobile internet users on location ... 

Keywords: PC cluster, parallel mining, web community, web mining



Workshop on , 4-8 Sept. 2000
Pages: 494 - 498

[Abstract] [PDF Full-Text (384 KB)] IEEE CNF